{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "cc4c32e4-a21f-475f-83f6-65ff89290461",
   "metadata": {},
   "source": [
    "# Guide: Using Vector Store Index with Existing Weaviate Vector Store"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "d32dfbab-e908-4703-a610-e0f1dd251f81",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "import weaviate"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "9329ab7d-bdbe-45be-aeeb-5d9adb4320b3",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "client = weaviate.Client(\"https://test-cluster-bbn8vqsn.weaviate.network\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "19963148-2eab-4442-8c66-a38b5e09d5d5",
   "metadata": {},
   "source": [
    "## Prepare Sample \"Existing\" Weaviate Vector Store"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "07bf1ee9-4e1b-416b-8781-618bdc025358",
   "metadata": {
    "jp-MarkdownHeadingCollapsed": true,
    "tags": []
   },
   "source": [
    "### Define schema\n",
    "We create a schema for \"Book\" class, with 4 properties: title (str), author (str), content (str), and year (int)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "id": "d029af19-887e-49e3-80a6-f93954656f93",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "try:\n",
    "    client.schema.delete_class(\"Book\")\n",
    "except:\n",
    "    pass"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "b02d5655-4459-40bd-a9f6-0cdac40044d5",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "schema = {\n",
    "    \"classes\": [\n",
    "        {\n",
    "            \"class\": \"Book\",\n",
    "            \"properties\": [\n",
    "                {\"name\": \"title\", \"dataType\": [\"text\"]},\n",
    "                {\"name\": \"author\", \"dataType\": [\"text\"]},\n",
    "                {\"name\": \"content\", \"dataType\": [\"text\"]},\n",
    "                {\"name\": \"year\", \"dataType\": [\"int\"]},\n",
    "            ],\n",
    "        },\n",
    "    ]\n",
    "}\n",
    "\n",
    "if not client.schema.contains(schema):\n",
    "    client.schema.create(schema)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "a816b14e-7839-4fb6-8844-dba07daaf07a",
   "metadata": {
    "jp-MarkdownHeadingCollapsed": true,
    "tags": []
   },
   "source": [
    "### Define sample data\n",
    "We create 4 sample books "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "id": "6d7c9840-2255-4fbd-b2e2-f5af6a30c5f9",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "books = [\n",
    "    {\n",
    "        \"title\": \"To Kill a Mockingbird\",\n",
    "        \"author\": \"Harper Lee\",\n",
    "        \"content\": \"To Kill a Mockingbird is a novel by Harper Lee published in 1960...\",\n",
    "        \"year\": 1960,\n",
    "    },\n",
    "    {\n",
    "        \"title\": \"1984\",\n",
    "        \"author\": \"George Orwell\",\n",
    "        \"content\": \"1984 is a dystopian novel by George Orwell published in 1949...\",\n",
    "        \"year\": 1949,\n",
    "    },\n",
    "    {\n",
    "        \"title\": \"The Great Gatsby\",\n",
    "        \"author\": \"F. Scott Fitzgerald\",\n",
    "        \"content\": \"The Great Gatsby is a novel by F. Scott Fitzgerald published in 1925...\",\n",
    "        \"year\": 1925,\n",
    "    },\n",
    "    {\n",
    "        \"title\": \"Pride and Prejudice\",\n",
    "        \"author\": \"Jane Austen\",\n",
    "        \"content\": \"Pride and Prejudice is a novel by Jane Austen published in 1813...\",\n",
    "        \"year\": 1813,\n",
    "    },\n",
    "]"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "2d96f974-39cb-4388-a315-e73b7ccf01db",
   "metadata": {
    "jp-MarkdownHeadingCollapsed": true,
    "tags": []
   },
   "source": [
    "### Add data\n",
    "We add the sample books to our Weaviate \"Book\" class (with embedding of content field"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "id": "1c6aa5bf-466f-4a08-913f-18b65f2e75dd",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from llama_index.embeddings.openai import OpenAIEmbedding\n",
    "\n",
    "embed_model = OpenAIEmbedding()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "id": "51baeaa1-9bba-4421-b978-8fa0bfafe4f3",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "with client.batch as batch:\n",
    "    for book in books:\n",
    "        vector = embed_model.get_text_embedding(book[\"content\"])\n",
    "        batch.add_data_object(data_object=book, class_name=\"Book\", vector=vector)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "f7d26d83-dc3d-4b07-9faa-da3246ee46eb",
   "metadata": {},
   "source": [
    "## Query Against \"Existing\" Weaviate Vector Store "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "bae39494-5c9f-43e7-8b26-f1afbd62c60a",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from llama_index.vector_stores import WeaviateVectorStore\n",
    "from llama_index import VectorStoreIndex\n",
    "from llama_index.response.pprint_utils import pprint_source_node"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "8647c689-5964-446f-9e5e-1da620483cef",
   "metadata": {},
   "source": [
    "You must properly specify a \"index_name\" that matches the desired Weaviate class and select a class property as the \"text\" field."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "59093426-0fda-46d0-84b5-8b84d6a81fe8",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "vector_store = WeaviateVectorStore(\n",
    "    weaviate_client=client, index_name=\"Book\", text_key=\"content\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "d8f06305-726b-45b8-bca7-e30e6d84524a",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "retriever = VectorStoreIndex.from_vector_store(vector_store).as_retriever(\n",
    "    similarity_top_k=1\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "9cf3b766-b8bb-4537-815e-1c62821a173b",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "nodes = retriever.retrieve(\"What is that book about a bird again?\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "e23ec43a-6472-4ac0-9183-fb3d109d401d",
   "metadata": {
    "tags": []
   },
   "source": [
    "Let's inspect the retrieved node. We can see that the book data is loaded as LlamaIndex `Node` objects, with the \"content\" field as the main text."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "f8ef9d87-b205-43a7-9fbf-c22e89e704e0",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Document ID: cf927ce7-0672-4696-8aae-7e77b33b9659\n",
      "Similarity: None\n",
      "Text: author: Harper Lee title: To Kill a Mockingbird year: 1960  To\n",
      "Kill a Mockingbird is a novel by Harper Lee published in 1960......\n"
     ]
    }
   ],
   "source": [
    "pprint_source_node(nodes[0])"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "97c6a8f0-3cde-4414-85fd-497ab6fe4248",
   "metadata": {
    "tags": []
   },
   "source": [
    "The remaining fields should be loaded as metadata (in `metadata`)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "6ecff7e3-a82c-4bac-a8b3-5d00035cbeab",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'author': 'Harper Lee', 'title': 'To Kill a Mockingbird', 'year': 1960}"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "nodes[0].node.metadata"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.16"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
